几十年来,对信用违约风险的预测一直是一个重要的研究领域。传统上,由于其准确性和解释性,逻辑回归被广泛认为是解决方案。作为最近的趋势,研究人员倾向于使用更复杂和高级的机器学习方法来提高预测的准确性。尽管某些非线性机器学习方法具有更好的预测能力,但通常认为它们缺乏金融监管机构的解释性。因此,它们尚未被广泛应用于信用风险评估中。我们引入了一个具有选择性选项的神经网络,以通过区分数据集来通过线性模型来解释,以提高可解释性。我们发现,对于大多数数据集,逻辑回归将足够,准确性合理。同时,对于某些特定的数据部分,浅神经网络模型可以提高精确度,而无需显着牺牲可解释性。
translated by 谷歌翻译
多年来,机器学习方法一直在各种领域(包括计算机视觉和自然语言处理)中使用。尽管机器学习方法比传统方法显着改善了模型性能,但它们的黑盒结构使研究人员难以解释结果。对于高度监管的金融行业,透明度,解释性和公平性同样重要,甚至比准确性重要。没有满足受管制要求的情况,即使是高度准确的机器学习方法也不太可能被接受。我们通过引入一种新颖的透明和可解释的机器学习算法来解决这个问题,称为神经添加剂模型的通用手套。神经添加剂模型的广义手套将特征分为三类:线性特征,单个非线性特征和相互作用的非线性特征。此外,最后类别中的交互仅是本地的。线性和非线性组件通过逐步选择算法区分,并通过应用加法分离标准仔细验证相互作用的组。经验结果表明,神经添加剂模型的广义手套可通过最简单的体系结构提供最佳的精度,从而可以采用高度准确,透明且可解释的机器学习方法。
translated by 谷歌翻译
数十年来,对信用违约风险的预测一直是一个积极的研究领域。从历史上看,逻辑回归由于遵守法规要求而被用作主要工具:透明度,解释性和公平性。近年来,研究人员越来越多地使用复杂和先进的机器学习方法来提高预测准确性。即使机器学习方法可以潜在地提高模型的准确性,但它会使简单的逻辑回归复杂化,会使解释性恶化并经常违反公平性。在没有法规要求的情况下,公司即使是高度准确的机器学习方法也不太可能被公司接受信用评分。在本文中,我们介绍了一类新颖的单调神经添加剂模型,这些模型通过简化神经网络体系结构并实施单调性来满足调节要求。通过利用神经添加剂模型的特殊体系结构特征,单调神经添加剂模型有效地违反了单调性。因此,训练的计算成本单调神经添加剂模型类似于训练神经添加剂模型的计算成本,作为免费午餐。我们通过经验结果证明,我们的新模型与Black-Box完全连接的神经网络一样准确,提供了一种高度准确且受调节的机器学习方法。
translated by 谷歌翻译
Accurate determination of a small molecule candidate (ligand) binding pose in its target protein pocket is important for computer-aided drug discovery. Typical rigid-body docking methods ignore the pocket flexibility of protein, while the more accurate pose generation using molecular dynamics is hindered by slow protein dynamics. We develop a tiered tensor transform (3T) algorithm to rapidly generate diverse protein-ligand complex conformations for both pose and affinity estimation in drug screening, requiring neither machine learning training nor lengthy dynamics computation, while maintaining both coarse-grain-like coordinated protein dynamics and atomistic-level details of the complex pocket. The 3T conformation structures we generate are closer to experimental co-crystal structures than those generated by docking software, and more importantly achieve significantly higher accuracy in active ligand classification than traditional ensemble docking using hundreds of experimental protein conformations. 3T structure transformation is decoupled from the system physics, making future usage in other computational scientific domains possible.
translated by 谷歌翻译
For Prognostics and Health Management (PHM) of Lithium-ion (Li-ion) batteries, many models have been established to characterize their degradation process. The existing empirical or physical models can reveal important information regarding the degradation dynamics. However, there is no general and flexible methods to fuse the information represented by those models. Physics-Informed Neural Network (PINN) is an efficient tool to fuse empirical or physical dynamic models with data-driven models. To take full advantage of various information sources, we propose a model fusion scheme based on PINN. It is implemented by developing a semi-empirical semi-physical Partial Differential Equation (PDE) to model the degradation dynamics of Li-ion-batteries. When there is little prior knowledge about the dynamics, we leverage the data-driven Deep Hidden Physics Model (DeepHPM) to discover the underlying governing dynamic models. The uncovered dynamics information is then fused with that mined by the surrogate neural network in the PINN framework. Moreover, an uncertainty-based adaptive weighting method is employed to balance the multiple learning tasks when training the PINN. The proposed methods are verified on a public dataset of Li-ion Phosphate (LFP)/graphite batteries.
translated by 谷歌翻译
Non-line-of-sight (NLOS) imaging aims to reconstruct the three-dimensional hidden scenes from the data measured in the line-of-sight, which uses photon time-of-flight information encoded in light after multiple diffuse reflections. The under-sampled scanning data can facilitate fast imaging. However, the resulting reconstruction problem becomes a serious ill-posed inverse problem, the solution of which is of high possibility to be degraded due to noises and distortions. In this paper, we propose two novel NLOS reconstruction models based on curvature regularization, i.e., the object-domain curvature regularization model and the dual (i.e., signal and object)-domain curvature regularization model. Fast numerical optimization algorithms are developed relying on the alternating direction method of multipliers (ADMM) with the backtracking stepsize rule, which are further accelerated by GPU implementation. We evaluate the proposed algorithms on both synthetic and real datasets, which achieve state-of-the-art performance, especially in the compressed sensing setting. All our codes and data are available at https://github.com/Duanlab123/CurvNLOS.
translated by 谷歌翻译
Masked image modeling (MIM) has shown great promise for self-supervised learning (SSL) yet been criticized for learning inefficiency. We believe the insufficient utilization of training signals should be responsible. To alleviate this issue, we introduce a conceptually simple yet learning-efficient MIM training scheme, termed Disjoint Masking with Joint Distillation (DMJD). For disjoint masking (DM), we sequentially sample multiple masked views per image in a mini-batch with the disjoint regulation to raise the usage of tokens for reconstruction in each image while keeping the masking rate of each view. For joint distillation (JD), we adopt a dual branch architecture to respectively predict invisible (masked) and visible (unmasked) tokens with superior learning targets. Rooting in orthogonal perspectives for training efficiency improvement, DM and JD cooperatively accelerate the training convergence yet not sacrificing the model generalization ability. Concretely, DM can train ViT with half of the effective training epochs (3.7 times less time-consuming) to report competitive performance. With JD, our DMJD clearly improves the linear probing classification accuracy over ConvMAE by 5.8%. On fine-grained downstream tasks like semantic segmentation, object detection, etc., our DMJD also presents superior generalization compared with state-of-the-art SSL methods. The code and model will be made public at https://github.com/mx-mark/DMJD.
translated by 谷歌翻译
Reinforcement learning (RL) is one of the most important branches of AI. Due to its capacity for self-adaption and decision-making in dynamic environments, reinforcement learning has been widely applied in multiple areas, such as healthcare, data markets, autonomous driving, and robotics. However, some of these applications and systems have been shown to be vulnerable to security or privacy attacks, resulting in unreliable or unstable services. A large number of studies have focused on these security and privacy problems in reinforcement learning. However, few surveys have provided a systematic review and comparison of existing problems and state-of-the-art solutions to keep up with the pace of emerging threats. Accordingly, we herein present such a comprehensive review to explain and summarize the challenges associated with security and privacy in reinforcement learning from a new perspective, namely that of the Markov Decision Process (MDP). In this survey, we first introduce the key concepts related to this area. Next, we cover the security and privacy issues linked to the state, action, environment, and reward function of the MDP process, respectively. We further highlight the special characteristics of security and privacy methodologies related to reinforcement learning. Finally, we discuss the possible future research directions within this area.
translated by 谷歌翻译
Detecting abrupt changes in data distribution is one of the most significant tasks in streaming data analysis. Although many unsupervised Change-Point Detection (CPD) methods have been proposed recently to identify those changes, they still suffer from missing subtle changes, poor scalability, or/and sensitive to noise points. To meet these challenges, we are the first to generalise the CPD problem as a special case of the Change-Interval Detection (CID) problem. Then we propose a CID method, named iCID, based on a recent Isolation Distributional Kernel (IDK). iCID identifies the change interval if there is a high dissimilarity score between two non-homogeneous temporal adjacent intervals. The data-dependent property and finite feature map of IDK enabled iCID to efficiently identify various types of change points in data streams with the tolerance of noise points. Moreover, the proposed online and offline versions of iCID have the ability to optimise key parameter settings. The effectiveness and efficiency of iCID have been systematically verified on both synthetic and real-world datasets.
translated by 谷歌翻译
In the new era of personalization, learning the heterogeneous treatment effect (HTE) becomes an inevitable trend with numerous applications. Yet, most existing HTE estimation methods focus on independently and identically distributed observations and cannot handle the non-stationarity and temporal dependency in the common panel data setting. The treatment evaluators developed for panel data, on the other hand, typically ignore the individualized information. To fill the gap, in this paper, we initialize the study of HTE estimation in panel data. Under different assumptions for HTE identifiability, we propose the corresponding heterogeneous one-side and two-side synthetic learner, namely H1SL and H2SL, by leveraging the state-of-the-art HTE estimator for non-panel data and generalizing the synthetic control method that allows flexible data generating process. We establish the convergence rates of the proposed estimators. The superior performance of the proposed methods over existing ones is demonstrated by extensive numerical studies.
translated by 谷歌翻译